The benefits of the orthogonal LSM models 
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ABSTRACT. In the last few decades both the vol- 
ume of high-quality observing data on variable stars 
and common access to them have boomed; however the 
standard used methods of data processing and inter- 
pretation have lagged behind this progress. The most 
popular method of data treatment remains for many 
decades Linear Regression (LR) based on the princi- 
ples of Least Squares Method (LSM) or linearized LSM. 
Unfortunately, we have to state that the method of lin- 
ear regression is not as a rule used accordingly namely 
in the evaluation of uncertainties of the LR parameters 
and estimates of the uncertainty of the LR predictions. 

We present the matrix version of basic relations of 
LR and the true estimate of the uncertainty of the LR 
predictions. We define properties of the orthogonal 
LR models and show how to transform general LR 
models into orthogonal ones. We give relations for 
orthogonal models for common polynomial series. 
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1. Introduction 

The development in the field of variable stars re- 
search from Tsessevichs times is enormous. The num- 
ber of known variable stars has arisen by at least two 
orders, as well as the number of their observers and in- 
terpreters. It has arisen both the volume and common 
access to high-quality variable stars observing data and 
computational techniques. The number of new efficient 
statistical techniques and methods that are available 
for everybody thanks to wide spread personal com- 
puters have been developed and published. Neverthe- 
less, the methods used for processing of variable stars 
data mostly have remained the same as those used in 
Vladimir Platonovichs era. 

Every astrophysicist likes large quantities and better 
quality of modern observational data, new methods of 
processing are not so popular. Majority of them needs 
a good knowledge of matrix calculus, what is in dis- 



cordance with a frequent syndrome of variable stars 
observers, which could be named Matrixphobia. Very 
rarely we are encountering with the opposite syndrome 
of Matrixphilia which invades mathematically erudite 
theoreticians loving new methods and matrices so much 
that they do not use them for real observational data. 
Both extremes in the data processing are bad and we 
should find our golden mean. 

The contemporary statistics shares inexhaustible 
quantity of methods. It is necessary to select several of 
the most versatile and diverse methods, master them 
and to learn to combine them. The method of process- 
ing must not be unique, but always must be made-to- 
measure of the set problem. 

The majority of variable stars data processing tasks 
are solved using least square method, strictly speaking 
linear regression, where as models serve the most fre- 
quently common polynomials or sine/cosine series. It 
should be noted that there exist several other meth- 
ods which are able to give the same or better re- 
sults. One of them is for example the Advanced Prin- 
cipal Component Analysis, which is the combination of 
LSM and standard Principal Component Analysis (see 
Mikulasek, 2007). The method is optimal for solving 
of a lot astrophysics problems as a realistic fitting of 
multicolour light curves, the determination of the mo- 
ments of extrema of multicolour light curves, modeling 
of light multicolour curves which is necessary for the 
process of improvement of ephemerides, diagnostics of 
light curve (LC) secular changes, and the classification 
of LCs. Other methods of modern data treatment are 
also mentioned in Andronov, I., these Proceedings. 

In the following section we will pay attention to 
some details of linear regression procedure which is 
very likely the most frequently used tool of variable 
stars data processing. 

2. The Least Squares Method 

The very frequent astrophysical task is to fit a curve 
through a series of N observed points described by a 
triad {xi, yi,Wi}, where Xi is an independent (well mea- 
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sured) quantity like time or a phase, related to the z-th 
measurement yi is the dependent, measured quantity 
like magnitude, O— C, and Wi is the weight of the mea- 
surement, as a rule inversely proportional to the square 
of the expected uncertainty of the value yi. Hereafter 
we will use normalized weights Wi the mean value w of 
which is equal to 1. 

F(x,(3) is so called model function of a; described by 
the k free parameters /?i, /?2, • ■ • , /3fe arranged into the 
vector j3. We define a function of this vector S{(3): 



N 



(1) 



The solution of the LSM minimalization procedure, is 
finding of the vector of parameters /3 = b, for which is 
the quantity S'(/3) minimal. The success of the method 
in the given situation depends above all on our skill in 
the creating of the mathematical model expressed by 
the function F{x,P). Then the finding of the best fit 
in the range of functions admissible by the pre-selected 
model is relatively simple and straightforward. In prin- 
ciple it is solution of k equations of k unknown param- 
eters arranged in the vector b: 



dp 



grad 



/3=b 



S{I3 = b) 



0. 



(2) 



^^ dF{xi,h) s^ j^t ,^^F{x^,h) 



dp, 
for j = 1,2,...,A:. 

2.1. Linear regression 



The LSM procedure of the determination of the solu- 
tion will be considerably simplified if we use the linear 
model of the found function F{x,P), assuming: 






(4) 



where fj(x) are arbitrary functions of x. Eq.[T] then 
can be rewritten in the form: 
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Eq.[3]then switches to 






(5) 
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Xl2^^/j(^*)^* ^Yl 



Ybpfp{x.i) 



z— 1 Lp— 1 



fj(xi)Wi, (6) 



It is advantageous to express all operations in matrix 
form. Then 



X = 



/ h{xi) h{xi) 

h{x2) f2ix2) 



\ fl(xN) f2{xN) 



.fk(xi) ^ 

fk{x2) 



fk{xN) / 



(7) 



Y = (yi y2 • • • yN) ; W = diag {wi W2 ■ ■ ■ wjv) , (8) 



H=(X'^WX) \ b = HX'^WY, Yp = Xb, (9) 
i? = Y'^WY-b'^X'^WY, s = 



R 



{N -k) 



, (10) 



where Yp is the vector of the predictions, w is the mean 
value of weights Wi, R is the weighted sum of square 
deflections, s is the weighted standard deviation of the 
fit. 

The procedure of linear regression with the explicit 
linear model is quick and its solution is unique. 
In the general case we may find several solutions 
although some of them could be physically unreal. 
The most common method of finding of local minima 
on the S{P) plane is an iterative gradient method, 
where we use the above mentioned apparatus of lin- 
ear regression applied on the linearized model function. 

2.2. Linearized regression 

The linearization of the general model function 
F(x, P) consists in substitution of it by its Taylor ex- 
pansion in respect of /3. We need to know as good as 
possible estimate be of the solution of LSM equations 
b, be — > b. Then we can write: 

F{x., P) - F{x.., be) + ^ ^^%^(/3, - &c,). (11) 



N 



s{P) = Y. 



fc 



dp, 






(12) 



where 



Ayi = y, 



F{xi,he), 



.Ux) 



5f(x,be) 
dp, ■ 



A/3 = /3 - be 



(13) 



The equations Eq.[5] and Eq.[T2] are formally iden- 
tical, despite the meanings of particular terms in 
them are different. We define column vector AY — 
[Ay I Ay2 ■ ■ ■ yN] j and the column vector of the correc- 
tion of the solution estimate b„, Ab. 



H=(X'^WX) \ Ab^HX^WAY, 

R = AY^WAY, s : 



R 



{N -k)' 



(14) 
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Figure 1: The illustrative figure displays the time de- 
pendence of an observed quantity measured with the 
accuracy denoted by the abscissa. The continuous line 
represents LSM fit by the polynomial of the 3-rd order 
(cubic parabola). Expected uncertainties of this pre- 
diction calculated by the formula Eq.[Tn] are denoted 
by dotted lines, true uncertainties given by Eq.[T7] are 
signed by dashed lines. 



Correcting be by Ab we get the next solution estimate 
be and we can repeat the whole procedure several 
times. The convergence of accordingly selected LSM 
model function is as a rule very swift: after a few steps 
we state that Ab -^ 0, hence b = be. 



2.3. Uncertainties of parameters and prediction 



5bj defined by Eq.[T5]will be supported by our attempt 
use these errors for the evaluating of the expected un- 
certainty of the prediction by the model function for 
the arbitrarily selected value of x: 



Sypix) 



\ 



k 

E 

i=i 



s'b.in^) 



g(x)Hdgg^(x), (16) 



where Hdg equals to the matrix H, whose all non- 
diagonal elements has been put zero. g(a;) is the row 
vector of the gradient of the solution model function 
F(a;,b),g(x) = [/i(x)/2(x) . . . h[x)] 

The instructive picture Fig.[l]will show you that this 
intuitive relation gives quite inadequate results. Never- 
theless, it can be shown that it is valid formally rather 
similar relation: 



5yp{x) = Vg(x)Hg'^(a;). 



(17) 



The matrix H is by the definition (see Eq.[9] and 
T4|) a symmetric square kxk matrix with k{k -I- l)/2 
independent elements. If we want to enable to any- 
body to compute the uncertainty of the prediction, 
we should publish either the whole matrix H or its 
non-trivial part at least. Nevertheless, there is another 
(more illustrative) possibility: to transform the model 
function into the orthogonal one. Then the matrix H 
will change in the diagonal one and the uncertainties 
of parameters will acquire its standard meaning. It 
will help you among other things expertly examine 
importance of individual terms. 



There are at least three reasons why we should esti- 
mate the measure of uncertainty of the found param- 
eters. Firstly, errors of parameters tell us a lot about 
the reliability of our results, secondly uncertainties of 
parameters would enable to calculate the uncertainty 
of the prediction done on the basis of our LSM analy- 
sis, and last but not least above mentioned errors are 
strictly demanded by teachers, scientific editors and 
referees. All LSM instructions and codes congruently 
get for uncertainty of the j'-the parameter 5b j the fol- 
lowing relation: 



5h, 



(15) 



where Hjj is the j'-th element in the diagonal of the 
matrix H. 

It is a question whether 5b j really expresses the un- 
certainty in the common sense. The response is no, 
strictly speaking sometimes yes, but very rarely. It can 
be demonstrated on the error of the absolute term in 
the LSM fit by straight line, which evidently depends 
on the choice of the origin of x coordinate. 

The suspicion that there is something incorrect in 
our comprehension of the true meaning of the quantity 



3. Orthogonal LSM models 

Let us assume that the functional dependence of ob- 
served quantities y on a; is well described by the model 
function which can be expressed in the form of the 
linear combination of k basic functions of fj{x) with 
coefficients hj. The found solution does not change 
if we use another set of k functions •dj{x), which are 
created as linear combinations of the basic functions 
fj{x). Let us combine them so that the new set of ba- 
sic functions dj {x) is orthogonal. It means we find the 
set of coefficients {apj}: 



■dp{x) = 2_^o.pj fj{x): so that, (18) 
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The calculation of linear regression parameters and 
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their uncertainties is then very simple: 



^N 



Syp{x) 



5b, 
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J2 S%^',{^) 



(20) 



The set of coefficients {apj} fulfilling constraints 
Eq.[Tn] is not unique as well as the procedures of its 
finding. We recommend to use the following procedure 
which seems to us the simplest one: 



i9i = /i; 1^2 = /2 - a2i^i; 

^3 = /s - a32??2 - asi^i; 
p-i 
■^p(a;) = /p(a;) -^apq-dq{x), 
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where 
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The first three orthogonalized terms will be: 



^i(x)=/i(x); M^) = f2ix) 



Ml 



Figure 2: The subsequent approximations of the fit of 
observed data by orthogonal polynomial regression. 



before the application of the orthogonalization proce- 
dure. It will result in the considerable simplification in 
the form of regression model. Assuming now x — Q the 
first four orthogonal polynomials are as follows: 



'i?i(x) — 1; ^2(2:) — X] "d^ix) — X — ^x — x^, 

x^ 



i}i{x) = x^ - 



Mx) = h{x) - ^^^ y h{x) 



x"^ x^ -\- x^ x'^ — x"^ x^ 
x^ -V x^ — x^ x** 
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/3/1 /2/1 (,/3/2 - /3 h) 



n 



n[!i~h 



x^ x^ -|- x^ X* — j;'' — x-^ x"^ 
x"^ + x^ — x"^ x^ 



x^ x^ + x^ — 2x^ x^ 
x^ ^ x^ — x^ x^ 



fi{x). 



(23) where, 



The explicit expression of successive terms of a set 
of the orthogonalized functions is more and more 
complex, however it is not very complicated to write 
an iterative PC code enabling to compute the formulae 
for arbitrary number of parameters. 

3.1. Orthogonal polynomial model 

The most popular linear regression model (not only 
in astrophysics) F{x,P) is: 






(25) 



Fig. [2] displays the results of subsequent fitting of the 
model situation by constant, linear, quadratic and cu- 
bic orthogonal polynomials. 

If the data are distributed uniformly in the interval 
Xi e (~A; A), we can use the transformed Legendre 
polynomials (orthogonal on the interval (—1; 1)) as the 
orthogonal (or quasiorthogonal) LSM model: 



A^ 3 A^ 

^1 = 1; i?2 = X] ^z^x" -—■ di = x^ -^ 

3 5 



i^(x,/3)=^/3ja;^-i. 
j=i 



(24) 



^5 = a;' 



6A2 



3A4 
~35" 



(26) 



The model is known to have a lot uncomfortable prop- 
erties which complicate both the calculation and the 
interpretation of found results. We should never used 
it without orthogonalization. 

We recommend to put the origin of x-coordinates 
into the center of gravity of observations: x —^ x — x 



3.2. Orthogonal sine, cosine model 

The basic tool for the analysis of cyclic and periodic 
processes in astrophysics is the linear regression with 
the model consisting of simple periodic functions, the 
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most commonly: 

q 

F{ip, /3) = /5i + ^ (32j cos{2Trjip) + /32j+i sin(27rjV), 

(27) 
where ip is the phase as an independent variable, q 
is the order of set of harmonic functions. The model 
need not contain all of functions, it might be limited 
e.g. only to even functions etc. 

In the case that the observations are spread over the 
whole cycle more or less uniformly, it is not needed 
to do any orthogonalization, because all functions are 
orthogonal itself. In the opposite case we should do 
orthogonalization e.g. by the procedure described by 
Eq.[2T]andEq.[ll 

4. Conclusions 

We displayed the benefits of consequential usage of 
orthogonal LSM model functions with the emphasis on 
the polynomial regression as the chief tool of astrophys- 
ical data processing. Orthogonal models enable to give 
the true sense to errors of found parameters and easily 
compute estimates for uncertainties of the prediction. 
The orthogonality of the models removes the bad con- 
ditioning of the solved systems of equations and help 
us to obtain results not affected by computational er- 
rors. We recommend to use them always, compulsorily 
in the case of polynomial regression. 

It is demanding to use new methods of variable stars 
data processing which enable us better exploit informa- 
tion hidden in observations. Endeavor connected with 
mastering of them will return in new subtle discoveries 
and revealing. 

Matrix calculus, true using of weights, advanced 
principal component analysis, factor analysis, robust 
regression, creation and usage of orthogonal models 
and several other processing techniques should ap- 
pertain to compulsory outfit of each variable stars 
observer of the 21st century. 
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